Add batch_size parameter in import_bulk method #207

Merged
joowani merged 17 commits into main from feature/import-bulk-and-create-graph-rework
Jun 24, 2022

Conversation

Member

@aMahanna aMahanna commented Jun 14, 2022

Introduces batch insertion via the import_bulk API.

These changes warrant a minor version increment.

Demo

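# Assumes `db` is an already-connected python-arango database handle.
vertices = [{'foo': str(i)} for i in range(1, 101)]
db.create_collection('numbers')
# 100 documents in batches of 10 produce one result per batch.
results = db.collection('numbers').import_bulk(vertices, batch_size=len(vertices) // 10)
assert len(results) == 10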
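
The batching in the demo can be pictured without a running ArangoDB server. The following is a minimal sketch, assuming the documents are split into consecutive slices of at most `batch_size` items. `split_into_batches` is a hypothetical helper written for illustration, not a function confirmed by this PR.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Same input as the demo: 100 documents split into batches of 10.
vertices = [{"foo": str(i)} for i in range(1, 101)]
batches: List[Sequence[dict]] = list(split_into_batches(vertices, len(vertices) // 10))

print(len(batches))     # number of batches
print(len(batches[0]))  # size of the first batch
```

Under this scheme, a document count that is not a multiple of `batch_size` simply yields a shorter final batch.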

@aMahanna aMahanna self-assigned this Jun 14, 2022
Member Author

aMahanna commented Jun 14, 2022

Some questions

  1. Does python-arango's readthedocs site generate documentation from anything other than the docstrings? I'm wondering whether there is other API documentation that I should be editing accordingly.
  2. I'm curious about the reason for having both _ensure_key_from_id() and _ensure_key_in_body(). I'm not currently sure why both exist.

@aMahanna aMahanna requested a review from joowani June 14, 2022 19:31

codecov-commenter commented Jun 14, 2022

Codecov Report

Merging #207 (56d74fd) into main (097f661) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #207   +/-   ##
=======================================
  Coverage   99.86%   99.86%           
=======================================
  Files          26       26           
  Lines        3722     3729    +7     
=======================================
+ Hits         3717     3724    +7     
  Misses          5        5           
| Impacted Files | Coverage Δ |
|---|---|
| arango/database.py | 100.00% <ø> (ø) |
| arango/collection.py | 99.75% <100.00%> (+<0.01%) ⬆️ |
| arango/utils.py | 100.00% <100.00%> (ø) |
| arango/client.py | 100.00% <0.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 097f661...56d74fd.
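
The reported "increase by 0.00%" follows from the hit and line counts in the coverage diff. Here is a small worked check, assuming the report truncates percentages to two decimals (the rounding mode is an assumption, not something stated in the report).

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated to two decimals (assumed rounding mode)."""
    return math.floor(hits / lines * 10_000) / 100


before = coverage_percent(3717, 3722)  # main
after = coverage_percent(3724, 3729)   # this PR

# Unrounded change in percentage points: positive, but far below 0.01.
raw_delta = (3724 / 3729 - 3717 / 3722) * 100

print(before, after, raw_delta)
```

All seven added lines are covered, so the ratio rises slightly, but the change is too small to show at two decimal places.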

@aMahanna aMahanna requested a review from joowani June 21, 2022 17:54
@aMahanna aMahanna changed the title from "import_bulk & create_graph rework" to "new: batch_size parameter in import_bulk" Jun 21, 2022
aMahanna added 2 commits June 22, 2022 08:33
(always return result in `list` format if `batch_size` is specified)
@aMahanna aMahanna requested a review from joowani June 22, 2022 13:21

results.append(self._execute(request, response_handler))

return results[0] if batch_size is None else results
Contributor

Member Author

@aMahanna aMahanna Jun 23, 2022

Makes sense, addressed in c65af2a
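
The return convention discussed in this thread (a single result when `batch_size` is omitted, a list with one result per batch otherwise) can be sketched as follows. This is an illustration only: `import_bulk_sketch` and `fake_execute` are hypothetical stand-ins, and the real method sends requests to the server rather than counting documents.

```python
from typing import Any, Dict, List, Optional, Sequence, Union

Result = Dict[str, Any]


def fake_execute(batch: Sequence[dict]) -> Result:
    """Stand-in for one import request; reports how many documents it received."""
    return {"created": len(batch)}


def import_bulk_sketch(
    documents: Sequence[dict], batch_size: Optional[int] = None
) -> Union[Result, List[Result]]:
    """Hypothetical illustration of the return shape, not python-arango's code."""
    if batch_size is None:
        # No batching requested: one request, one result.
        return fake_execute(documents)
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    # Batching requested: always return a list, even if there is only one batch.
    return [
        fake_execute(documents[start:start + batch_size])
        for start in range(0, len(documents), batch_size)
    ]


docs = [{"foo": str(i)} for i in range(1, 101)]
single = import_bulk_sketch(docs)                      # plain result
batched = import_bulk_sketch(docs, batch_size=10)      # list of 10 results
one_batch = import_bulk_sketch(docs, batch_size=1000)  # list with 1 result
```

Returning a list whenever `batch_size` is given keeps the return type predictable for callers, regardless of how many batches the input happens to produce.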

@aMahanna aMahanna requested a review from joowani June 23, 2022 00:45
Contributor

joowani commented Jun 24, 2022

> Some questions
>
> 1. Does python-arango's readthedocs site generate documentation from anything other than the docstrings? I'm wondering whether there is other API documentation that I should be editing accordingly.
> 2. I'm curious about the reason for having both _ensure_key_from_id() and _ensure_key_in_body(). I'm not currently sure why both exist.

  1. Only the API specifications are generated from the docstrings. Take a look at the contents of the docs directory for the rest.
  2. Honestly, I don't remember anymore, haha. The functions do slightly different things, so I guess I wanted to run only the code that was strictly necessary (since they run once per document, which means a lot of calls).

@joowani joowani changed the title from "new: batch_size parameter in import_bulk" to "Add batch_size parameter in import_bulk method" Jun 24, 2022
@joowani joowani merged commit 917f699 into main Jun 24, 2022
@joowani joowani deleted the feature/import-bulk-and-create-graph-rework branch June 24, 2022 07:40